Sex-specific genetic modifiers identified susceptibility of cold stored red blood cells to osmotic hemolysis

Background Genetic variants have been found to influence red blood cell (RBC) susceptibility to hemolytic stress and affect transfusion outcomes and the severity of blood diseases. Males have a higher susceptibility to hemolysis than females, but little is known about the genetic mechanism contributing to the difference. Results To investigate the sex differences in RBC susceptibility to hemolysis, we conducted a sex-stratified genome-wide association study and a genome-wide gene-by-sex interaction scan in a multi-ethnic dataset with 12,231 blood donors who have in vitro osmotic hemolysis measurements during routine blood storage. The estimated SNP-based heritability for osmotic hemolysis was found to be significantly higher in males than in females (0.46 vs. 0.41). We identified SNPs associated with sex-specific susceptibility to osmotic hemolysis in five loci (SPTA1, KCNA6, SLC4A1, SUMO1P1, and PAX8) that impact RBC function and hemolysis. Conclusion Our study established a best practice to identify sex-specific genetic modifiers for sexually dimorphic traits in datasets with mixed ancestries, providing evidence of different genetic regulations of RBC susceptibility to hemolysis between sexes. These and other variants may help explain observed sex differences in the severity of hemolytic diseases, such as sickle cell and malaria, as well as the viability of red cell storage and recovery. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08461-4.


Background
Red blood cell (RBC) response to canonical in vitro stressors, such as cold storage, osmotic hemolysis, and oxidative hemolysis, has been associated with altered RBC survival after transfusion [1][2][3]. In humans, in vitro osmotic stress is a highly reproducible trait [4] that can be further mediated by study donor characteristics such as sex, ancestry, age, donation history, and genetic factors that regulate RBC integrity and functions.
Sex is an influential factor for hemolysis, as males have enhanced susceptibility to hemolysis [1]. The sex dichotomy in RBC susceptibility to hemolysis is present in cold stored RBCs and in hemolytic diseases including sickle cell anemia [13][14][15][16][17]. It is largely unknown whether the sex bias is mediated by different genetic mechanisms. In this study, we aimed to identify genetic variants regulating RBC function and hemolysis in a sex-specific manner utilizing data from blood donors in the National Heart, Lung, and Blood Institute RBC-Omics cohort [18]. This diverse, multi-ethnic cohort has 12,231 blood donors with European, African, Hispanic, and Asian ancestry. We evaluated both the sex-stratified GWAS strategy and the genome-wide gene-by-sex interaction scan for osmotic hemolysis during routine blood storage, established the best practice for sex-specific genetic analysis in different scenarios, and characterized the common and different genetic variants regulating RBC hemolysis between male and female sex.

Quantified SNP-specific effect size and p-value differences derived from sex-stratified GWAS
Based on the sex-stratified GWAS, we compared the differences between effect sizes and p-values at each SNP across the genome. Assuming no relatedness in RBC-Omics data, we first tested the effect size differences between male- and female-specific GWAS (Eq. (2) in Methods). The QQ plot shows no systematic bias, and no genome-wide significant difference was found for effect sizes (Supplementary Fig. S2). The test for p-value difference, based on Eq. (3) in Methods, identified two genome-wide significant ($p < 5 \times 10^{-8}$) loci, SPTA1 and KCNA6, and three suggestive ($p < 5 \times 10^{-7}$) loci, PAX8, SLC4A1, and SUMO1P1 (Table 1, Fig. 2). SPTA1 is known to modulate RBC function [11], and our results show it has significant effects in both males and females. KCNA6 is a member of the Potassium Voltage-Gated Channel family, which is involved in cell volume regulation, including in RBCs [20]. Results show that genetic variants around KCNA6 are significantly associated with osmotic hemolysis in females but not in males. The same female-specific associations are observed around the gene SLC4A1, which encodes erythrocyte band 3 protein, a protein that plays a pivotal role in regulating anion transport across membranes and in which mutations are associated with hereditary spherocytosis and the Diego blood group [21][22][23]. Conversely, genetic variants surrounding the other two genes, SUMO1P1 (SUMO1 Pseudogene 1), which despite its name is spliced and translated, and PAX8 (a member of the paired box family of transcription factors), demonstrated association with osmotic hemolysis only in males. Taking SLC4A1 as an example, Fig. 3 illustrates the difference in GWAS results for osmotic hemolysis between males and females and the difference in gene expression in whole blood.

Genome-wide gene-by-sex interaction scan
We performed a genome-wide gene-by-sex interaction scan using Eq. (4) (see Methods). The joint analysis for SNP main effects and SNP × Sex interactions indicated an inflation problem (Supplementary Fig. S3, $\lambda = 1.55$). Although the results from the joint analysis revealed many loci consistent with those previously reported to be associated with osmotic hemolysis [5], the global inflation indicates that some confounding factors were not appropriately controlled in the interaction analysis [26], largely because of the diverse, multi-ethnic participants in the RBC-Omics dataset. The top 10 principal components (PCs) that account for population stratification in regular GWAS are not enough to control the confounding effects in the interaction analysis. To illustrate the influence of the multi-ethnic population structure in the context of interaction analysis, we conducted ancestry-specific gene-by-sex interaction scans in individuals labeled as non-Hispanic White (EUR, N = 7,598) and African American (AFR, N = 1,036) separately. The genetic ancestry was defined by clustering analysis of the RBC-Omics population overlaid on the 1000 Genomes Phase 3 samples [5]. The QQ plots for gene-by-sex interaction analyses in both EUR and AFR show no inflation (Supplementary Fig. S4), indicating that the complicated population structure in a multi-ethnic dataset like RBC-Omics was not appropriately controlled using the regular PCs in the interaction model.
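The inflation described above is usually summarized by a genomic inflation factor. The following is a minimal sketch of one common way to compute it, assuming each p-value is converted to a 1-degree-of-freedom chi-square statistic; the function name `genomic_inflation` is illustrative, and the source does not state how its inflation value was calculated.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return the ratio of the observed median chi-square to its expected median.

    Assumption: every p-value lies in (0, 1] and is mapped to a 1-df
    chi-square statistic, regardless of the test that produced it.
    """
    norm = NormalDist()
    # A two-sided p-value p corresponds to z = Phi^{-1}(p / 2); chi2 = z**2.
    # The lower tail is used so that very small p-values stay representable.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a well-calibrated scan (ratio close to 1).
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
calibrated = genomic_inflation(uniform_p)
# Shrinking every p-value mimics systematic inflation (ratio above 1).
inflated = genomic_inflation([p ** 2 for p in uniform_p])
```

A value near 1 indicates a well-calibrated scan, while values well above 1 point to uncontrolled confounding such as population stratification.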

Discussion
This study identified sex-specific genetic determinants of hemolysis that may explain the observed sex dichotomy in RBC susceptibility to osmotic hemolysis of cold stored red cells. We conducted both a genome-wide gene-by-sex interaction scan and sex-stratified GWAS for osmotic hemolysis using the REDS-III RBC-Omics cohort. The estimated SNP-based heritability was significantly higher in males than in females ($h^2_g = 0.46$ vs. $h^2_g = 0.41$, $p < 2 \times 10^{-16}$). We have developed a statistic to compare p-values from sex-stratified GWAS (Eq. 3). The statistic was used to conduct a genome-wide scan that quantified, for each genetic variant, the difference in the significance of the association between the sexes. Loci near KCNA6 and SLC4A1 were identified as having female-specific associations with osmotic hemolysis. The top SNP at the SLC4A1 locus, rs13306780, is associated with nine RBC traits according to the GWAS Catalog [27]. Our study also identified two male-specific loci associated with osmotic hemolysis, near SUMO1P1 and PAX8. The gene SUMO1P1 was recently identified as a new member of the small ubiquitin-like modifier (SUMO) family, with exceptionally high expression levels in testes and peripheral blood leukocytes [28]. It is involved in the formation and disruption of promyelocytic leukemia-based nuclear structures that regulate various cellular processes. In the GWAS Catalog, SUMO1P1 is associated with RBC traits such as mean corpuscular hemoglobin, red cell distribution width, and mean corpuscular volume. The other gene, PAX8, is a transcription factor associated with hemoglobin, RBC count, and hematocrit [29,30]. Interestingly, these genes with sex-specific associations with hemolysis have significant differences ($p < 0.05$) in gene expression between males and females in whole blood according to GTEx, except for KCNA6, which does not have expression data (Supplementary Table 2).
Thus, the validity of the genetic variants identified in this study is supported by the known associations with RBC measurements and the gene expression data in whole blood. These genetic variants likely underlie mechanisms that regulate osmotic hemolysis of RBCs.
One limitation of the statistic developed is that it may flag genetic variants that are significant in both sexes but to different extents. For example, the genetic variants around SPTA1 reached genome-wide significance in both male- and female-specific GWAS, with more extreme p-values in the female sex. The difference is unlikely to be caused by detection power, because the sample sizes were similar for both sexes. The expression data for SPTA1 did not show a difference in whole blood between the sexes. However, in sex-biased eQTL analysis results from GTEx [31], there is one SNP, rs863327, identified as an eQTL for SPTA1 in whole blood only in females ($p = 0.009$), but not in males ($p < 0.3$). In our sex-stratified GWAS results, the same SNP, rs863327, was significantly associated with osmotic hemolysis in both males ($p = 6.89 \times 10^{-9}$) and females ($p = 1.54 \times 10^{-13}$). Therefore, the observed difference in the significance levels of associations may indicate potential variation of the underlying regulation mechanisms between the sexes.
[Figure caption fragment: "... shows significantly (p = 0.039) more expression (Transcripts Per Million) in whole blood in females than males according to GTEx [25]"]
Although the comparison of effect sizes between male- and female-specific GWAS for osmotic hemolysis did not show a significant difference, the tests did reveal genetic variants with opposite effects between the sexes (Supplementary Table 1). The top hit was around the gene SCFD1 (Sec1 Family Domain Containing 1), which participates in protein metabolism pathways. Although the gene expression of SCFD1 in whole blood did not show a difference between males and females ($p = 0.84$), the sex-biased eQTL analysis indicated the existence of a sex-specific eQTL for SCFD1 in whole blood [31].
We performed a genome-wide gene-by-sex interaction scan for osmotic hemolysis using the RBC-Omics data. However, the heterogeneity of ancestry in such a multi-ethnic dataset caused systematic inflation of the p-values for the interaction term SNP × Sex. To properly control for population stratification in the interaction analysis, covariate-by-gene, covariate-by-sex, PC-by-gene, and PC-by-sex interaction terms should be included in the same model. However, such options are currently limited by available software and a heavy computing burden. Recently published work in another multi-ethnic dataset, the Population Architecture using Genomics and Epidemiology study, has demonstrated the benefits and importance of conducting genome-wide genetic analysis in diverse populations to maximize genetic findings and reduce health disparities [32]. Our study raised the question of how to properly control confounding factors when conducting "MEGA" genome-wide interaction analysis in such multi-ethnic datasets. Further development of methods is needed to address these issues, because we believe these confounding effects could arise for any covariate used in interaction analysis.
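To make the size of such a fully adjusted model concrete, the sketch below builds one row of regressors containing the covariate-by-gene, covariate-by-sex, PC-by-gene, and PC-by-sex terms named above. The function `design_row`, the 0/1 coding of sex, and the example values are illustrative assumptions and are not taken from the study's analysis code.

```python
def design_row(dosage, sex, covariates, pcs):
    """Build one row of regressors (intercept excluded) for a single donor.

    dosage     : SNP allele dosage (assumed 0 to 2)
    sex        : assumed coded 0/1
    covariates : e.g. age and donation history
    pcs        : ancestry principal components
    """
    row = [dosage, sex, dosage * sex]        # SNP, sex, SNP x sex
    row += list(covariates) + list(pcs)      # covariate and PC main effects
    row += [dosage * c for c in covariates]  # covariate-by-gene terms
    row += [sex * c for c in covariates]     # covariate-by-sex terms
    row += [dosage * pc for pc in pcs]       # PC-by-gene terms
    row += [sex * pc for pc in pcs]          # PC-by-sex terms
    return row


# Two covariates and ten PCs give 3 + 3 * (2 + 10) = 39 regressors per SNP.
example = design_row(1.0, 1, [45.0, 3.0], [0.1] * 10)
n_terms = len(example)
```

Because the gene-dependent terms change with every SNP, the full model has to be refit genome-wide, which is one way to see the computing burden mentioned above.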
Another limitation of the study is the lack of a replication cohort for the in vitro hemolysis measures since RBC-Omics is the first study to explore stress hemolysis as a quantitative trait. Therefore, follow-up studies are needed to validate the findings in the present study.

Conclusions
In summary, we have assessed sex-specific genetic associations for RBC susceptibility to osmotic hemolysis in RBC-Omics. The ethnically diverse populations in RBC-Omics provide a comprehensive evaluation of the genetic factors but also limit the use of the standard genome-wide gene-by-sex interaction scan method because of the improper control for population stratification in the interaction model. Therefore, we implemented sex-stratified GWAS and then compared both the effect sizes and p-values for each genetic variant between the sexes across the genome. The resulting unbiased QQ plots indicated the validity of the derived statistics. Using this methodology, we found sex heterogeneity for osmotic hemolysis in five loci: SPTA1, KCNA6, SLC4A1, SUMO1P1, and PAX8. Our results reinforce the need to consider sex-specific associations in characterizing the genetic architecture of sexually dimorphic traits like osmotic hemolysis. Furthermore, the identified loci with sex-specific associations shed light on potential biological mechanisms underlying the sex differences in osmotic hemolysis, with implications for the efficacy of RBC transfusions and, more importantly, for understanding sex differences in the penetrance and severity of genetic and acquired hemolytic diseases as well as infectious diseases such as malaria [33][34][35].

REDS-III RBC-Omics cohort and blood osmotic hemolysis measurement
The REDS-III RBC-Omics study aimed to improve blood transfusion safety by evaluating the association of donor characteristics (e.g., sex, age, race/ethnicity) with blood storage quality and post-transfusion outcomes. The RBC-Omics cohort consists of a multi-ethnic population (12% African American, 12% Asian, 8% Hispanic, 64% white, and 5% multiracial/other) of blood donors with well-characterized demographic, behavioral, and donation history [18]. In total, 13,403 healthy blood donors over the age of 18 were enrolled at four U.S. blood centers. The genetic data QC and imputation were described in detail in Page et al. [5]. We removed related samples by keeping one relative per family based on the relatedness estimation using identity-by-descent/identity-by-state (IBD/IBS). For this study, the final informative sample size is 12,231. Table 2 describes the sample characteristics.
As one of the measures indicating blood storage quality, RBC osmotic hemolysis is defined by the loss of hemoglobin in response to reduced osmotic pressure. In REDS-III RBC-Omics, osmotic hemolysis was determined in vitro as the rate of osmotic hemolysis following incubation of washed RBCs (stored for 39-42 days at 1-6 °C) in a modified pink test buffer [13], and the measure ranges from 0 to 100%.
Fang et al. BMC Genomics (2022) 23:227
Using a multivariable linear model, our previous study [13] demonstrated that males and older age groups have higher osmotic hemolysis, whereas African American and Asian ethnicity and donation history are negatively associated with osmotic hemolysis. Thus, these modifiers are all included in our models in this study.

Sex-stratified GWAS
In each sex, linear regression was used to test the association between each SNP and osmotic hemolysis using the software ProbABEL [36]. Models were adjusted for age, donation history, ancestry, and the sex-specific top 10 ancestry PCs.

Approaches to identify differences between sex-stratified GWAS
A general method to detect differences between GWAS results stratified by sex is a statistical test on the effect sizes [37]:

$$t = \frac{\beta_{male} - \beta_{female}}{\sqrt{SE_{male}^2 + SE_{female}^2 - 2r \cdot SE_{male} \cdot SE_{female}}} \quad (1)$$

where $\beta_{male}$ is the effect size of the genetic variant in the male-specific GWAS, and $SE_{male}$ is the corresponding standard error. The term $r \cdot SE_{male} \cdot SE_{female}$ is an estimate of the covariance between $\beta_{male}$ and $\beta_{female}$, which accounts for the relatedness among samples; $r$ is the Spearman rank correlation coefficient across all SNPs. Assuming no relatedness in the dataset, the test simplifies to

$$t = \frac{\beta_{male} - \beta_{female}}{\sqrt{SE_{male}^2 + SE_{female}^2}} \quad (2)$$

In addition to the comparison of the effect sizes, we developed a statistic, $u$, to compare the p-values between sex-stratified GWAS (Eq. (3)). In this statistic, $p_{males}$ is the p-value of the genetic variant in the male-specific GWAS. Under the assumption that both $p_{males}$ and $p_{females}$ follow a uniform distribution, the statistic $u$ also follows the uniform distribution $U[0, 1]$.
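The effect-size comparison described above can be sketched in a few lines. This is an illustrative implementation rather than the authors' code: the function name `effect_size_difference` is hypothetical, and the use of a standard normal reference distribution for the two-sided p-value is an assumption that is reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist


def effect_size_difference(beta_male, se_male, beta_female, se_female, r=0.0):
    """Return (z, two-sided p) for the male-minus-female effect size difference.

    r scales the covariance term r * SE_male * SE_female; the default r = 0
    gives the simplified test that assumes no relatedness between strata.
    """
    variance = se_male ** 2 + se_female ** 2 - 2.0 * r * se_male * se_female
    z = (beta_male - beta_female) / sqrt(variance)
    # Assumption: the statistic is referred to a standard normal distribution.
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


# Hypothetical SNP with opposite effects in the two sex-specific GWAS.
z_stat, p_diff = effect_size_difference(0.3, 0.1, -0.1, 0.1)
```

A positive `r` shrinks the variance of the difference, so ignoring a real correlation between the two strata makes the simplified test conservative.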

Genome-wide gene-by-sex interaction scan for the sex difference in osmotic hemolysis
A joint analysis approach that simultaneously tests both SNP main effects and SNP × environment interactions has been employed in gene-environment interaction studies [38][39][40]. We used the same approach to detect a joint effect of SNP and SNP × Sex interactions on osmotic hemolysis:

$$Y = \beta_0 + \beta_{SNP} \cdot SNP + \beta_{sex} \cdot Sex + \beta_{g \times s} \cdot SNP \times Sex + \beta_{Cov} \cdot Cov + \varepsilon \quad (4)$$

where $Y$ is the osmotic hemolysis; $Cov$ stands for a set of covariates, such as age, number of donations during the past 2 years, and the top 10 PCs accounting for population stratification. The estimates of the coefficients $\beta_{SNP}$ and $\beta_{g \times s}$ and their covariance matrix can be used to construct a Wald statistic, which follows a $\chi^2$ distribution with two degrees of freedom. The genome-wide gene-by-sex interaction scan was conducted with ProbABEL [36] using the option "-interaction". A custom script then calculated the Wald statistic and corresponding p-values from the output coefficient and covariance estimates.
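The two-degree-of-freedom Wald test described above can be written out explicitly for the two coefficients of interest. The sketch below is not the authors' script: the function name `joint_wald_test` and its argument layout are assumptions, and it expects the variances and covariance of the SNP main effect and the SNP × Sex interaction estimates as plain numbers.

```python
from math import exp


def joint_wald_test(beta_snp, beta_gxs, var_snp, var_gxs, cov_snp_gxs):
    """Return (W, p) for the joint null beta_SNP = beta_gxs = 0.

    W = b' V^{-1} b, where b = (beta_snp, beta_gxs) and V is their
    2 x 2 covariance matrix; W is referred to a chi-square with 2 df.
    """
    det = var_snp * var_gxs - cov_snp_gxs ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form with the 2 x 2 inverse written out explicitly.
    wald = (beta_snp ** 2 * var_gxs
            - 2.0 * beta_snp * beta_gxs * cov_snp_gxs
            + beta_gxs ** 2 * var_snp) / det
    # The survival function of a chi-square with 2 df is exp(-x / 2).
    return wald, exp(-wald / 2.0)


# Hypothetical estimates with uncorrelated coefficients.
wald_stat, wald_p = joint_wald_test(0.2, 0.1, 0.01, 0.01, 0.0)
```

Because the 2-df chi-square survival function has a closed form, no statistics library is needed; a per-SNP loop over the scan output would apply the same function to each row.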